A SCHEDULER DEVICE FOR A SYSTEM HAVING ASYMMETRICALLY-SHARED RESOURCES

The present invention relates to a scheduler, also
referred to as a service discipline, for a system that
comprises a plurality of nodes sharing a plurality of
resources such as wavelengths.

Such a system is constituted, for example, by an
optical packet ring network of the dual bus optical ring
network (DBORN) type. The architecture of the ring is
organized around a concentrator and is constituted by a
plurality of nodes such as optical packet add/drop
multiplexers (OPADMs), each node being in communication
with the concentrator. The network contains a write bus
corresponding to a plurality of "up" wavelengths and a
read bus corresponding to a plurality of "down"
wavelengths. The up and down wavelengths are usually
multiplexed on the same fiber and are shared by the
nodes of the network for sending packets to and
receiving packets from the concentrator. A plurality of
nodes thus share a common resource such as a wavelength
for receiving packets sent by the concentrator, which
can be considered as the source node.

However, in order to take account of the specific
features of each node, the nodes do not necessarily all
share the same resources. Thus, it can happen that a
resource is shared by only a fraction of the nodes of
the network.

Since the nodes do not all share the same resources in
the same proportions, the resources are said to be
shared asymmetrically.

One of the functions of networks relates to service
discipline, i.e. determining, amongst a plurality of
waiting queues or buffers, which packet associated with
a particular queue is to be sent to a node. This
determination is performed by a device referred to as a
scheduler.






The present invention provides a scheduler device,
also known as a service discipline, for a system
comprising a plurality of nodes that share a plurality
of resources such as wavelengths in asymmetric manner.
To this end, the present invention provides a
scheduler device for scheduling the transmission of data
from a plurality of queues in a source node to a
plurality of destination nodes via a plurality of outlet
ports from said source node, each of said outlet ports
being associated with a resource, the data being
transmitted via said resource to said destination nodes,
each of said nodes receiving data from all or some of
said plurality of resources, said scheduler device being
characterized in that it has a plurality of servers, each
of said servers being associated with a respective one of
the resources of said plurality of resources and each of
said servers including scheduler means, said scheduler
means being independent for each of said servers.

By means of the invention, each server operates
independently of the other servers and can take account
of the specific features of the resource with which it is
associated, and in particular the fact that a resource is
not shared uniformly by all of the destination nodes,
each node making use of said resource with a certain
weighting coefficient. This weighting coefficient may be
zero if the node does not use said resource. The
coefficient may itself be weighted depending on the
importance of that resource for the destination node.
Thus, a resource that is used by a first node and by a
second node is not shared in the same manner by the first
node and the second node if the first node makes use of
more other resources than does the second node. For
example, each server can take two weights into
consideration: a first weight providing information about
the use of the resource by the node and representing the
asymmetry of the system; and a second weight giving
information about the ratio with which that resource is
used by the node as a function of the traffic destined
for said node relative to the total traffic.

In an embodiment, said scheduler means comprise a
plurality of stages corresponding respectively to a
plurality of scheduling schemes using different criteria.

In an embodiment, said scheduler means comprise
cyclical scheduling means of the round robin type.

The round robin scheduler means scan the first-in
first-out (FIFO) type queues sequentially and cyclically
and serve the first non-empty queue that is ready. If a
queue is empty, then the scheduler means move on to the
following queue. Some queues may be privileged by
defining a weight, corresponding, for example, to the
number of elements or packets that the scheduler may take
from the head of the queue; this is referred to as
weighted round robin (WRR).
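
The weighted round robin behavior described above can be sketched as follows. This is a minimal illustration, not an implementation taken from the invention: the function name `weighted_round_robin`, the packet labels, and the weight values are all hypothetical, and each weight is assumed to be the maximum number of packets taken from the head of a queue per visit.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield packets by scanning FIFO queues cyclically.

    On each visit, queue k may send up to weights[k] packets
    (assumption: a weight is a per-visit packet count).
    Empty queues are skipped.
    """
    while any(queues):  # an empty deque is falsy
        for queue, weight in zip(queues, weights):
            for _ in range(weight):
                if not queue:
                    break  # queue drained: move on to the next one
                yield queue.popleft()


# Hypothetical example: the first queue is privileged with weight 2.
queues = [deque(["a1", "a2", "a3"]), deque(["b1"]), deque(["c1", "c2"])]
order = list(weighted_round_robin(queues, [2, 1, 1]))
print(order)
```

With all weights equal to 1, the same function reduces to plain round robin.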

In another embodiment, said scheduler means include
weighted fair queuing (WFQ) scheduler means.

This algorithm gives priority treatment to low-volume
flows and enables large-volume flows to make use of the
remaining space. For this purpose, it sorts and
regroups packets by flow, and then puts them into queues
depending on the volume of traffic in each flow.
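
One common way of approximating weighted fair queuing is to tag each packet with a virtual finish time and to serve packets in increasing tag order. The sketch below is a simplification under stated assumptions: all packets are taken to be backlogged at time zero (so no virtual clock is needed), and the function name `wfq_order`, the flow names, and the packet sizes are hypothetical.

```python
import heapq


def wfq_order(packets, weights):
    """Return the service order of (flow, size) packets.

    Simplified WFQ: each packet gets a finish tag equal to the
    previous tag of its flow plus size / weight, and packets are
    served by increasing tag (arrival order breaks ties).
    Assumes every packet is already queued at time zero.
    """
    last_finish = {flow: 0.0 for flow in weights}
    heap = []
    for seq, (flow, size) in enumerate(packets):
        finish = last_finish[flow] + size / weights[flow]
        last_finish[flow] = finish
        heapq.heappush(heap, (finish, seq, flow, size))

    served = []
    while heap:
        _, _, flow, size = heapq.heappop(heap)
        served.append((flow, size))
    return served


# Hypothetical flows: a low-volume "voice" flow and a bulk flow.
packets = [("bulk", 1500), ("bulk", 1500), ("voice", 100),
           ("bulk", 1500), ("voice", 100)]
order = wfq_order(packets, {"bulk": 1.0, "voice": 1.0})
print(order)
```

With equal weights, the small packets of the low-volume flow obtain the smallest finish tags and are served first, while the order of packets within each flow is unchanged.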

Advantageously, said scheduler means depend on a
static and/or dynamic set of weights.

By way of example, the static weights may come from
conventional methods of sharing or allocating resources.
The dynamic weights may be calculated on the basis of
congestion control information. A combination of these
two types of weighting can also be envisaged.

In a particularly advantageous embodiment, said
scheduler means depend on a first set of weights, each of
said weights representing the percentage of said resource
allocated to each of said nodes in said plurality of
nodes.

This type of weighting is obtained by conventional
resource sharing or allocation methods.

Advantageously, said scheduler means depend on a
second set of weights, each of said weights representing
the weight of the traffic of each of said nodes
relative to the total traffic.

The present invention also provides a node including
a scheduler device of the invention and having a
plurality of queues for sending data to a plurality of
destination nodes, and a plurality of outlet ports.

The invention also provides a data transmission
system comprising at least one source node of the
invention, said system further comprising:

• a plurality of destination nodes; and

• a plurality of resources.

Other characteristics and advantages of the present
invention appear from the following description of an
embodiment of the invention, given by way of non-limiting
illustration. In the figures:

• Figure 1 is a diagram of a transmission system
incorporating a first embodiment of the scheduler device
of the invention;

• Figure 2 is a diagram of a transmission system
incorporating a second embodiment of the scheduler device
of the invention; and

• Figure 3 illustrates three-level arbitration.

Figure 1 is a diagram of a transmission system 10
such as an optical packet ring network. This
representation is restricted to the elements needed to
describe the invention, and the system may have numerous
other elements. The system 10 comprises:

• a source node 1;

• three destination nodes N1, N2, and N3; and

• four resources OR1, OR2, OR3, and OR4.

By way of example, the resources OR1, OR2, OR3, and
OR4 are wavelengths multiplexed on an optical fiber using
a dense wavelength division multiplex (DWDM) technique.

By way of example, the nodes N1, N2, and N3 are
optical packet add/drop multiplexers (OPADMs).

By way of example, the source node 1 is an
electronic concentrator such as an Ethernet switch.
The source node 1 comprises:

• three queues or buffers B1, B2, and B3 enabling
packets to be stored before sending them respectively to
the nodes N1, N2, and N3;

• a scheduler device 2, also referred to as a service
discipline; and

• four outlet ports P1, P2, P3, and P4 enabling data
packets to be sent respectively over the resources OR1,
OR2, OR3, and OR4.

The scheduler device 2 comprises four servers S1, S2,
S3, and S4 each associated with a respective one of the
resources OR1, OR2, OR3, and OR4 and with a respective one
of the ports P1, P2, P3, and P4.

Each of the four servers S1, S2, S3, and S4 determines
which packet associated with a particular queue is to be
sent to a node via the resource associated with the
server.

The resources OR1 and OR2 are shared by the nodes N1
and N2.

The resource OR3 is shared by the nodes N2 and N3.
The resource OR4 is shared by the nodes N1 and N3.
The resources are thus not shared uniformly by the
nodes N1, N2, and N3.

Thus, a single resource used by a first node and by
a second node need not be used in the same manner, the
first node making use of more other resources than
the second node.

For example, the node N1 uses the resources OR1, OR2,
and OR4, while the node N3 uses only the resources OR3 and
OR4. The node N1 can therefore use three resources while
the node N3 can use only two.

The resource allocation method thus takes account of
this non-uniformly distributed allocation and gives each
of the nodes a weight corresponding to the percentage of
said resource allocated to that node. This weighting is
written in general manner as $R_{ij}$ and corresponds to
the ratio allocated to node $N_i$ of resource $OR_j$.

In addition, the destination nodes may have weights
that are different because of their traffic. Thus, if
the traffic destined for node $N_i$ is written $T_i$, then
each node may be weighted by a coefficient
$W_i = T_i / \sum_k T_k$, where $\sum_k T_k$ designates the
sum of the traffic to all of the nodes.

Thus, each of the servers is given a series of
weights referred to as "meta-weights" for each of the
nodes, taking account both of the asymmetrical sharing of
the resources and of the differing amounts of traffic for
each of the nodes.

These meta-weights are summarized in Table 1 below,
and each corresponds to the product of $R_{ij}$ multiplied
by $W_i$.

Servers/nodes   N1          N2          N3
S1              W1 × R11    W2 × R21    W3 × R31
S2              W1 × R12    W2 × R22    W3 × R32
S3              W1 × R13    W2 × R23    W3 × R33
S4              W1 × R14    W2 × R24    W3 × R34

Table 1
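
The computation of the meta-weights can be illustrated by the sketch below. It is given for illustration only: the traffic volumes and the allocation ratios are hypothetical values (the text gives none), chosen so that a ratio is zero wherever a node is not connected to a resource in the system of Figure 1, and the function name `meta_weights` is likewise an assumption.

```python
# Hypothetical traffic T_i destined for each node (arbitrary units).
traffic = {"N1": 50.0, "N2": 30.0, "N3": 20.0}

# Hypothetical ratios R_ij: allocation[node][resource].
# Zero means the node does not use that resource.
allocation = {
    "N1": {"OR1": 0.5, "OR2": 0.5, "OR3": 0.0, "OR4": 0.5},
    "N2": {"OR1": 0.5, "OR2": 0.5, "OR3": 0.5, "OR4": 0.0},
    "N3": {"OR1": 0.0, "OR2": 0.0, "OR3": 0.5, "OR4": 0.5},
}


def meta_weights(traffic, allocation):
    """Return table[resource][node] = W_i * R_ij.

    W_i is the traffic of node i divided by the total traffic.
    Each row of the result is the set of weights handed to the
    server associated with that resource.
    """
    total = sum(traffic.values())
    w = {node: t / total for node, t in traffic.items()}
    resources = sorted({r for shares in allocation.values() for r in shares})
    return {
        r: {node: w[node] * allocation[node].get(r, 0.0) for node in traffic}
        for r in resources
    }


table = meta_weights(traffic, allocation)
for resource, row in table.items():
    print(resource, row)
```

Each row of `table` plays the role of one row of Table 1; a zero entry means that the corresponding server never selects the queue of that node.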

Each of said servers uses these meta-weights and
proceeds independently of the other servers with a
scheduling mechanism of the round robin type, of the
weighted round robin (WRR) type, or of the weighted fair
queuing (WFQ) type in order to select the queue and the
packet(s) to be sent. The servers may comprise software
means, hardware means, or a combination of both.
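
The independence of the servers can be sketched as follows, assuming that each server keeps its own weighted round robin state and takes one packet at a time from the head of the shared queues. The class name `Server`, the integer credits (taken here to be derived from the meta-weights by some rounding that the text does not specify), and the packet labels are all hypothetical.

```python
from collections import deque


class Server:
    """One server per resource, with its own WRR state."""

    def __init__(self, name, credits):
        # credits[node]: visits per cycle for that node's queue
        # (assumed integers; zero or absent = resource not used).
        self.name = name
        self._schedule = [n for n, c in credits.items() for _ in range(c)]
        self._pos = 0

    def select(self, queues):
        """Pop one packet from the next non-empty eligible queue."""
        for _ in range(len(self._schedule)):
            node = self._schedule[self._pos]
            self._pos = (self._pos + 1) % len(self._schedule)
            if queues[node]:
                return node, queues[node].popleft()
        return None  # nothing to send on this resource


# Queues shared by all servers, one per destination node.
queues = {
    "N1": deque(["p1", "p2", "p3"]),
    "N2": deque(["q1", "q2"]),
    "N3": deque(["r1"]),
}
# Three of the four servers, following the sharing of Figure 1.
servers = [
    Server("S1", {"N1": 1, "N2": 1}),
    Server("S3", {"N2": 1, "N3": 1}),
    Server("S4", {"N1": 1, "N3": 1}),
]

sent = []
for _slot in range(2):
    for server in servers:
        choice = server.select(queues)
        if choice is not None:
            sent.append((server.name, *choice))
print(sent)
```

No server consults the state of another; the only shared objects are the queues themselves, and each server takes a single packet from the head of a queue at a time.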

The weights as described above can be updated
statically or dynamically. Dynamic updating enables
scheduling to adapt dynamically by taking account of
variation in loading as a function of time and of
destination.

In addition, the invention makes it possible to keep
packets in order, eliminating any need for complex and
expensive mechanisms or procedures for mitigating the
consequences of loss of sequencing and for reorganizing
packets. In order to ensure that packets are kept in
order, it suffices that packet servicing complies with
the established order, the servers making use of
packet-by-packet parallel access (and not block
access).

The invention is described above with reference to a
set of weights representing the relative weights of
traffic for each of the nodes compared with the total
traffic, but other sets of weights may be used
representing other parameters or characteristics of each
of the nodes, such as types of service and/or of user.
The weights may be applied in the form of meta-weights,
as described above, but they can also be applied in the
form of parameters that are separated in different
levels.

Figure 2 is a diagram of a transmission system
incorporating a second embodiment of the scheduler device
of the invention, having a plurality of stages L3
corresponding respectively to a plurality of scheduling
operations using different criteria. The network 10' is
analogous to the network 10 described above. It differs
in its scheduler device in the source node 1', and it
comprises:

• three queues or buffers B'1, B'2, and B'3 serving
to store packets before sending them respectively to the
nodes N1, N2, and N3, each of these queues being provided
with a flow level scheduler respectively referenced FLA1,
FLA2, FLA3 to arbitrate between the flows F1, ..., Fn each
heading for the same outlet from the node 1';

• a node level scheduler device 2' which arbitrates
between loads corresponding respectively to the different
destinations as a function of bus capacities; and

• four resource level scheduler devices RA1, RA2,
RA3, and RA4 serving to take account of the way in which
the nodes N1, ..., N3 are connected to the resources OR1,
OR2, OR3, and OR4.

Figure 3 illustrates this three-level arbitration
implemented in the scheduler device of node 1' as shown
in Figure 2.

Naturally, the invention is not limited to the 
embodiments described above. In particular, the number 
of hierarchical levels may be greater than three. 

Specifically, the invention is described above in
the context of an optical packet network; however, it can
be generalized to any type of system using resources that
are shared asymmetrically, such as a computer system
having a plurality of memory units (queues) connected to
a plurality of processors (servers) via a plurality of
resources (electronic circuits) organized as a read and
write bus, the source node designating an individual
component having said plurality of memory units.

Similarly, the scheduling mechanisms may be 
different from those described. 



